Skip to content

Docs refresh: trim README to a 188-line landing page; redirect contributor conventions#372

Merged
igerber merged 13 commits intomainfrom
docs-refresh
Apr 25, 2026
Merged

Docs refresh: trim README to a 188-line landing page; redirect contributor conventions#372
igerber merged 13 commits intomainfrom
docs-refresh

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Apr 25, 2026

Summary

  • README rewritten from 3,119 → 188 lines (-94%) as a landing page: hero image, 6 badges (preserved), tightened tagline, lean estimator catalog, 14 H2 sections (down from 17). PyPI snippet now scannable.
  • New docs/references.rst lifts the 137-line bibliography out of README into a dedicated RTD page (added to docs/index.rst toctree).
  • New repo-root llms.txt for AI-crawler discovery; mirrors the RTD-hosted llms.txt / llms-full.txt / llms-practitioner.txt and lists the in-process get_llm_guide() modes.
  • diff-diff.png (1731x909 hero asset) committed to git so the absolute raw.githubusercontent.com URL resolves on PyPI and on GitHub main post-merge.
  • Convention redirects to prevent regression: CONTRIBUTING.md "README is a landing page" rewrite, CLAUDE.md new ## README discipline section, docs-check.md per-estimator README requirement replaced with llms.txt + flat-catalog requirement, dev-checklists.md / pre-merge-check.md / review-pr.md / review-plan.md Documentation Sync sections updated. docs/doc-deps.yaml 18 stale README.md / section: "<EstimatorName>" pointers redirected to the new flat-catalog convention.

Methodology references (required if estimator / math changes)

  • N/A - no methodology, math, or estimator changes. Pure docs + workflow + asset PR.

Validation

  • Tests added/updated: none (no methodology changes).
  • README is 188 lines (target 165-220) with 14 H2 sections (target 14).
  • All 30 RTD links in README resolve 200 (excluding new references.html which resolves after first RTD rebuild post-merge).
  • All 3 RTD links in /llms.txt resolve 200.
  • get_llm_guide() smoke-tested and unchanged (13,166 chars).
  • PyPI readme-renderer produces valid HTML with hero image and all 14 H2s.
  • Hero image embedded in first 20 lines with all 4 marquee estimator names ("Callaway-Sant'Anna", "Synthetic DiD", "Honest DiD", "Event Studies") + "Difference-in-Differences" in alt text; blank line after </p> confirmed.
  • docs/references.rst parses cleanly via docutils (no warnings; full Sphinx build deferred to CI).
  • Zero em dashes across README.md, docs/references.rst, llms.txt.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes (deterministic content + filename scans returned zero matches on staged changes).

@github-actions
Copy link
Copy Markdown

Overall Assessment

⚠️ Needs changes

No estimator/math/inference code changed, so I did not find new numerical-method defects. The blockers are in the new documentation contract itself: the PR makes docs/references.rst and the .claude review commands authoritative, but both are still internally inconsistent in ways that weaken source-material traceability and produce false review failures.

Executive Summary

  • No runtime methodology regressions found; this is a docs/workflow-only review.
  • docs/references.rst is incomplete relative to the public estimator/workflow surfaces it is now supposed to back.
  • .claude/commands/docs-check.md’s README extraction command is broken, and the new instructions still misplace Honest DiD under ## Estimators.
  • docs/doc-deps.yaml was only partially retargeted: HonestDiD still points to the wrong README section, and the map never surfaces docs/references.rst.
  • README still has relative links that work on GitHub but break on PyPI.

Methodology

  • Severity: P1. Impact: docs/references.rst is now declared the bibliography source of truth (CONTRIBUTING.md:L35-L38), but it omits several sources still surfaced in the public catalog/workflow: Baker et al. 2025 (diff_diff/guides/llms-practitioner.txt:L602-L606), dCDH’s dynamic companion (docs/methodology/REGISTRY.md:L475-L476), ContinuousDiD (docs/methodology/REGISTRY.md:L679), EfficientDiD (docs/methodology/REGISTRY.md:L751), and Wooldridge ETWFE (docs/methodology/REGISTRY.md:L1304-L1306), even though README still advertises those methods (README.md:L106-L117). That leaves the new “source of truth” incomplete for source-material traceability. Concrete fix: add those full citations to docs/references.rst and align titles/years with the Registry.
  • No other methodology findings; I did not find estimator/math/SE/default-behavior changes in this diff.

Code Quality

  • Severity: P1. Impact: the redirected docs-review workflow is broken in .claude/commands/docs-check.md:L116-L129. The published command awk '/^## Estimators/,/^## /' README.md self-terminates on the opening H2, so it returns only the heading on this README. The same section also tells reviewers to look under ## Estimators for every row even though the file itself marks HonestDiD as a diagnostics exception (.claude/commands/docs-check.md:L55-L56), and the actual surfaces put it under diagnostics (README.md:L120-L126, diff_diff/guides/llms.txt:L71-L77). Concrete fix: replace the range command with one that skips the opening heading and stops at the next H2, and explicitly special-case diagnostic-class features/names.

Performance

  • No findings.

Maintainability

  • Severity: P2. Impact: docs/doc-deps.yaml was only partially updated. HonestDiD still points to README.md "Estimators (one-line catalog entry)" (docs/doc-deps.yaml:L428-L440), and the map has no docs/references.rst dependencies even though /docs-impact and pre-merge warnings rely on it (.claude/commands/docs-impact.md:L8-L20, .claude/commands/pre-merge-check.md:L112-L134). Future source changes will not prompt bibliography updates and will point reviewers at the wrong README section. Concrete fix: retarget HonestDiD to the diagnostics section and add docs/references.rst as a dependent doc for the method-backed source modules.

Tech Debt

  • No mitigating TODO.md entry covers the issues above, so they remain untracked/unmitigated.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: the new landing page still contains relative links at README.md:L83 and README.md:L182-L184 (docs/methodology/REPORTING.md, CITATION.cff, CONTRIBUTORS.md). These work on GitHub but break on PyPI, which is one of the main target surfaces for this rewrite. Concrete fix: use absolute GitHub or RTD URLs for those links.
  • Severity: P3. Impact: README.md:L54 says there are 18 tutorials, but the repo currently has 19 notebooks under docs/tutorials/. Concrete fix: update the count or avoid hard-coding it.

Path to Approval

  1. Fill in docs/references.rst with the missing public-method/workflow citations and cross-check them against docs/methodology/REGISTRY.md.
  2. Fix .claude/commands/docs-check.md so the README catalog extraction actually works and diagnostics-class items, especially Honest DiD, are checked in the correct section with the correct display name.

igerber added a commit that referenced this pull request Apr 25, 2026
P1 - docs/references.rst incomplete (5 missing citations)
- Add Baker, Larcker, Wang & Wang (2025) - "Difference-in-Differences Designs:
  A Practitioner's Guide" arXiv:2503.13323; backs the 8-step Practitioner
  Workflow now surfaced in README and llms-practitioner.txt.
- Add Callaway, Goodman-Bacon & Sant'Anna (2024) - "Difference-in-Differences
  with a Continuous Treatment" NBER WP 32117; backs ContinuousDiD (under new
  "Continuous Treatment DiD" section).
- Add Chen, Sant'Anna & Xie (2025) - "Efficient Difference-in-Differences and
  Event Study Estimators"; backs EfficientDiD.
- Add de Chaisemartin & D'Haultfoeuille (2022, revised 2024) NBER WP 29873;
  dynamic companion to dCDH 2020, backs the cohort-recentered plug-in
  variance formula.
- Add Wooldridge (2025) Empirical Economics + Wooldridge (2023) Econometrics
  Journal under new "Wooldridge ETWFE" sub-section; backs WooldridgeDiD's
  saturated OLS and logit/Poisson QMLE paths.

P1 - .claude/commands/docs-check.md
- Replace broken `awk '/^## Estimators/,/^## /'` (self-terminates on opening
  H2) with a flag-based extractor that skips the heading and stops at the
  next H2.
- Special-case diagnostic-class items (HonestDiD): the workflow text now
  routes them to `## Diagnostics & Sensitivity` in both llms.txt and README,
  matching the existing table.

P2 - docs/doc-deps.yaml
- Retarget HonestDiD's README entry from "Estimators (one-line catalog
  entry)" to "Diagnostics & Sensitivity (one-line entry)" so /docs-impact
  flags the correct README section.
- Add `- path: docs/references.rst` (type: user_guide) to all 18 method-
  backed source modules so /docs-impact and /pre-merge-check warnings
  surface bibliography updates when source code changes.

P2 - README.md absolute links
- L83 docs/methodology/REPORTING.md, L182 CITATION.cff, L184 CONTRIBUTORS.md
  switched to absolute https://github.com/igerber/diff-diff/blob/main/...
  URLs so they resolve on PyPI (which does not auto-resolve relative paths).

P3 - README.md tutorial count
- Drop the "18" hard-coded count in the Tutorials bullet to prevent future
  drift; repo currently has 19 notebooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: dc443d5537fa3155d357abb35f0637cdd56b5123


Overall Assessment

✅ Looks good

This re-review resolves the prior P1-level concerns. I did not find any unmitigated P0/P1 issues in the PR diff; the remaining items are P2 documentation-tooling consistency gaps.

Executive Summary

  • Previous blockers appear addressed: docs/references.rst now includes the previously missing public-method citations, the README’s PyPI-facing links are now absolute, and /docs-check no longer uses the broken self-terminating awk range form (docs/references.rst:L22-L31, docs/references.rst:L145-L166, README.md:L83, README.md:L180-L184, .claude/commands/docs-check.md:L128-L147).
  • No estimator code, inference code, defaults, weighting, or identification logic changed in this PR, so I found no new numerical-methodology defects.
  • The Baker et al. workflow summary added to the README is aligned with the project’s documented adaptation of that paper, and the deviations are already recorded in docs/methodology/REGISTRY.md, so I am not treating that as a defect (README.md:L85-L98, docs/methodology/REGISTRY.md:L3245-L3262).
  • Two minor workflow-contract mismatches remain: one in /docs-check’s StaggeredTripleDifference API-doc target, and one in docs/doc-deps.yaml’s partial migration to the new doc-surface contract.

Methodology

  • No unmitigated findings. This PR does not change estimator math, SEs, assumptions, or defaults. The only method-adjacent addition is the README practitioner workflow summary, and its diff-diff-specific adaptation from Baker et al. is explicitly documented in docs/methodology/REGISTRY.md:L3245-L3262.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • Severity: P2 [Newly identified]. Impact: docs/doc-deps.yaml is still only partially migrated to the new “llms.txt + references + landing-page README” contract. For example, diff_diff/staggered_triple_diff.py still maps only methodology/API docs (docs/doc-deps.yaml:L165-L172), and diff_diff/sun_abraham.py still omits diff_diff/guides/llms.txt (docs/doc-deps.yaml:L176-L198), even though the new contributor contract treats those surfaces as required (.claude/commands/docs-check.md:L35-L56, CONTRIBUTING.md:L21-L48). That means /docs-impact will still under-report required documentation follow-ups for some public estimators. Concrete fix: backfill the missing README.md, diff_diff/guides/llms.txt, docs/references.rst, and related public-surface mappings for every estimator listed in the new /docs-check table, starting with StaggeredTripleDifference and SunAbraham.

Tech Debt

  • No separate findings. The maintainability gap above is not tracked in TODO.md, but it is P2 workflow debt, not a blocker under this rubric.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: .claude/commands/docs-check.md now says StaggeredTripleDifference’s API documentation target is index.rst (.claude/commands/docs-check.md:L51-L53), but docs/api/index.rst does not list diff_diff.StaggeredTripleDifference in its estimator autosummary (docs/api/index.rst:L11-L30), and docs/api/staggered.rst only documents Callaway-Sant’Anna and Sun-Abraham (docs/api/staggered.rst:L15-L125). As written, /docs-check can produce a false pass for that estimator’s API-doc requirement. Concrete fix: either add a real StaggeredTripleDifference API entry to the docs and point /docs-check at that file, or remove that row from the required-doc table until the API docs exist.
  • No runtime or methodology tests were expected for this docs-only PR. I did not run a Sphinx/docutils build in this environment.

@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

igerber added a commit that referenced this pull request Apr 25, 2026
P2 - docs/doc-deps.yaml partially migrated to new doc-surface contract
- Add `diff_diff/guides/llms.txt` entries to 12 estimator blocks that
  already had llms-full.txt (CallawaySantAnna, SunAbraham, ImputationDiD,
  TwoStageDiD, EfficientDiD, ContinuousDiD, StackedDiD, SyntheticDiD,
  TripleDifference, TROP, HonestDiD, WooldridgeDiD, ChaisemartinDHaultfoeuille,
  Survey). Section labels match the actual llms.txt heading the entry
  belongs under: "Estimators", "Diagnostics and Sensitivity Analysis"
  (HonestDiD), "Survey Support" (survey).
- Backfill bacon.py with llms-full.txt + llms.txt entries (and the missing
  02_staggered_did tutorial reference for Bacon decomposition).
- Backfill staggered_triple_diff.py with full surface mapping
  (16_survey_did tutorial, README, references.rst, llms-full.txt, llms.txt).
- Net: all 17 method-backed estimator sources now have the required
  README + llms.txt + llms-full.txt + references.rst entries. BusinessReport
  / DiagnosticReport intentionally lack llms.txt entries (not in llms.txt
  per their experimental-preview status).

P2 - StaggeredTripleDifference API doc target
- Add `.. autoclass:: diff_diff.StaggeredTripleDifference` and
  `StaggeredTripleDiffResults` sections to docs/api/staggered.rst.
- Update docs-check.md table row from index.rst (where it was a false-pass)
  to staggered.rst (where the autoclass actually lives now).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 1e6eef7a7b37d92c7ac2b09f7c92ad03c9976682


Overall Assessment

✅ Looks good

I did not find any unmitigated P0/P1 issues in this re-review. The PR remains documentation-only, and the method-adjacent additions I checked are consistent with the Methodology Registry and existing guide files.

Executive Summary

  • Previous re-review blockers appear resolved: StaggeredTripleDifference now has an API docs section in docs/api/staggered.rst:L127-L143, the bibliography has been split into docs/references.rst:L1-L184, and that page is now wired into RTD via docs/index.rst:L65-L74.
  • No estimator code, inference code, weighting logic, defaults, or identification checks changed in this PR.
  • The new Baker-et-al. workflow summary in README.md:L85-L98 matches the documented diff-diff adaptation in docs/methodology/REGISTRY.md:L3247-L3262, so I am not treating it as a methodology deviation.
  • docs/doc-deps.yaml looks materially more aligned with the new doc-surface contract; the prior partial-migration concern is largely addressed in the changed ranges, including docs/doc-deps.yaml:L163-L186.
  • One P2 documentation navigation gap remains: the new README catalog entry for StaggeredTripleDifference points to api/index.html, but that index still does not list the estimator.
  • One minor P3 documentation consistency issue remains in the intro text of docs/api/staggered.rst.

Methodology

  • No findings. The only method-adjacent content added here is documentation: the practitioner workflow in README.md:L85-L98 and its bibliography entry in docs/references.rst:L157-L159. Both align with the documented Baker adaptation notes in docs/methodology/REGISTRY.md:L3247-L3262.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. The earlier doc-map drift concern appears substantially improved by the new docs/doc-deps.yaml entries, especially for the moved bibliography and StaggeredTripleDifference surfaces (docs/doc-deps.yaml:L163-L186).

Security

  • No findings. diff-diff.png appears to be a standard PNG asset; a basic metadata/string pass surfaced provenance metadata only, with no obvious secret-like material or PII.

Documentation/Tests

  • Severity: P2. Impact: The new README catalog entry for StaggeredTripleDifference links to api/index.html (README.md:L116), but the API index still does not list either diff_diff.StaggeredTripleDifference or diff_diff.StaggeredTripleDiffResults (docs/api/index.rst:L11-L30, docs/api/index.rst:L37-L65). The actual docs now live in docs/api/staggered.rst:L127-L143, and the refreshed docs contract also points there (.claude/commands/docs-check.md:L35-L56). As written, the landing-page catalog link does not take readers to the estimator docs they expect. Concrete fix: change the README link to api/staggered.html; ideally sync the same fix in diff_diff/guides/llms.txt:L67, or alternatively add the estimator and results class to docs/api/index.rst.
  • Severity: P3. Impact: docs/api/staggered.rst:L4-L11 still says the page provides “two main estimators” and lists only Callaway-Sant’Anna and Sun-Abraham, but the same page now also documents StaggeredTripleDifference (docs/api/staggered.rst:L127-L143). This is a minor accuracy drift in the page intro. Concrete fix: update the intro to describe three documented estimators, or narrow the sentence so it explicitly introduces only the first two sections.

I did not run a full Sphinx/docutils build in this environment.

@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

igerber added a commit that referenced this pull request Apr 25, 2026
P2 - StaggeredTripleDifference catalog link points at api/index.html which
does not list it
- Update README.md L116 catalog link from `api/index.html` to
  `api/staggered.html#staggeredtripledifference` (where the autoclass added
  in the prior commit actually renders).
- Sync the same fix in diff_diff/guides/llms.txt L67 so the bundled
  AI-agent guide and the README agree on the canonical URL.

P3 - docs/api/staggered.rst intro counted "two main estimators" but page
now also documents StaggeredTripleDifference (added in prior commit)
- Update intro to list three estimators (CS, SA, StaggeredTripleDifference)
  and clarify the robustness-check note applies to CS+SA only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: a2b2df76e978ea4509c60119a251d0ce69ccd0fa


Overall Assessment
✅ Looks good

Executive Summary

  • Previous re-review documentation issues are resolved: README.md:L116 and diff_diff/guides/llms.txt:L67 now point StaggeredTripleDifference directly to api/staggered.html#staggeredtripledifference, and docs/api/staggered.rst:L6-L12 now describes all three estimators.
  • No estimator code, weighting, variance/SE logic, defaults, or identification checks changed in this PR.
  • The added Baker workflow surface is consistent with the documented adaptation notes in docs/methodology/REGISTRY.md:L3247-L3261, and the new bibliography entry in docs/references.rst:L157-L159 is appropriate.
  • One P2 documentation inconsistency remains: the refreshed high-level workflow says HonestDiD applies to dCDH, but the detailed practitioner guide still says HonestDiD works only with MultiPeriodDiD and CallawaySantAnna.
  • One P3 maintainability nit remains in docs/doc-deps.yaml: several touched diff_diff/guides/llms.txt mappings still lack section metadata.

Methodology

  • Severity: P2. Impact: README.md:L94-L98 and diff_diff/guides/llms.txt:L23-L25 now advertise compute_honest_did(results) for MultiPeriodDiD, CallawaySantAnna, or dCDH, which matches the documented dCDH extension in docs/methodology/REGISTRY.md:L637. But diff_diff/guides/llms-practitioner.txt:L296-L300 still says HonestDiD works only with MultiPeriodDiD and CallawaySantAnna, even though the same guide’s estimator tree says dCDH supports HonestDiD at diff_diff/guides/llms-practitioner.txt:L177-L180. That leaves the methodology guidance internally inconsistent for reversible-treatment workflows. Concrete fix: update Step 6 in diff_diff/guides/llms-practitioner.txt to include dCDH with the registry caveats (L_max >= 1, placebo-based pre-periods, diagonal-variance warning), or remove dCDH from the refreshed high-level workflow surfaces if that support should not be advertised there.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • Severity: P3. Impact: the refresh makes diff_diff/guides/llms.txt a first-class documentation surface, but several touched dependency-map entries still omit which llms section they belong to at docs/doc-deps.yaml:L104-L105, docs/doc-deps.yaml:L147-L148, and docs/doc-deps.yaml:L316-L317. That weakens the new map-validation contract because tooling can tell the file exists but not whether the mapping is to ## Estimators or another section. Concrete fix: add section: "Estimators" (or the appropriate section name) to the remaining unsectioned diff_diff/guides/llms.txt mappings touched by this PR.

Tech Debt

  • No findings. I did not see new deferrable work here that needed a TODO.md entry.

Security

  • No findings. diff-diff.png appears to be a normal PNG asset with provenance metadata only; I did not find secrets or PII in the changed text surfaces.

Documentation/Tests

  • No findings. I confirmed README.md is 188 lines with 14 H2s, and docs/index.rst:L70-L73 now wires references into the docs tree.
  • Verification note: I did not run a full Sphinx/docutils build in this environment because docutils is not installed.

@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

igerber added a commit that referenced this pull request Apr 25, 2026
P1 - docs/references.rst incomplete (5 missing citations)
- Add Baker, Larcker, Wang & Wang (2025) - "Difference-in-Differences Designs:
  A Practitioner's Guide" arXiv:2503.13323; backs the 8-step Practitioner
  Workflow now surfaced in README and llms-practitioner.txt.
- Add Callaway, Goodman-Bacon & Sant'Anna (2024) - "Difference-in-Differences
  with a Continuous Treatment" NBER WP 32117; backs ContinuousDiD (under new
  "Continuous Treatment DiD" section).
- Add Chen, Sant'Anna & Xie (2025) - "Efficient Difference-in-Differences and
  Event Study Estimators"; backs EfficientDiD.
- Add de Chaisemartin & D'Haultfoeuille (2022, revised 2024) NBER WP 29873;
  dynamic companion to dCDH 2020, backs the cohort-recentered plug-in
  variance formula.
- Add Wooldridge (2025) Empirical Economics + Wooldridge (2023) Econometrics
  Journal under new "Wooldridge ETWFE" sub-section; backs WooldridgeDiD's
  saturated OLS and logit/Poisson QMLE paths.

P1 - .claude/commands/docs-check.md
- Replace broken `awk '/^## Estimators/,/^## /'` (self-terminates on opening
  H2) with a flag-based extractor that skips the heading and stops at the
  next H2.
- Special-case diagnostic-class items (HonestDiD): the workflow text now
  routes them to `## Diagnostics & Sensitivity` in both llms.txt and README,
  matching the existing table.

P2 - docs/doc-deps.yaml
- Retarget HonestDiD's README entry from "Estimators (one-line catalog
  entry)" to "Diagnostics & Sensitivity (one-line entry)" so /docs-impact
  flags the correct README section.
- Add `- path: docs/references.rst` (type: user_guide) to all 18 method-
  backed source modules so /docs-impact and /pre-merge-check warnings
  surface bibliography updates when source code changes.

P2 - README.md absolute links
- L83 docs/methodology/REPORTING.md, L182 CITATION.cff, L184 CONTRIBUTORS.md
  switched to absolute https://github.com/igerber/diff-diff/blob/main/...
  URLs so they resolve on PyPI (which does not auto-resolve relative paths).

P3 - README.md tutorial count
- Drop the "18" hard-coded count in the Tutorials bullet to prevent future
  drift; repo currently has 19 notebooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 25, 2026
P2 - docs/doc-deps.yaml partially migrated to new doc-surface contract
- Add `diff_diff/guides/llms.txt` entries to 12 estimator blocks that
  already had llms-full.txt (CallawaySantAnna, SunAbraham, ImputationDiD,
  TwoStageDiD, EfficientDiD, ContinuousDiD, StackedDiD, SyntheticDiD,
  TripleDifference, TROP, HonestDiD, WooldridgeDiD, ChaisemartinDHaultfoeuille,
  Survey). Section labels match the actual llms.txt heading the entry
  belongs under: "Estimators", "Diagnostics and Sensitivity Analysis"
  (HonestDiD), "Survey Support" (survey).
- Backfill bacon.py with llms-full.txt + llms.txt entries (and the missing
  02_staggered_did tutorial reference for Bacon decomposition).
- Backfill staggered_triple_diff.py with full surface mapping
  (16_survey_did tutorial, README, references.rst, llms-full.txt, llms.txt).
- Net: all 17 method-backed estimator sources now have the required
  README + llms.txt + llms-full.txt + references.rst entries. BusinessReport
  / DiagnosticReport intentionally lack llms.txt entries (not in llms.txt
  per their experimental-preview status).

P2 - StaggeredTripleDifference API doc target
- Add `.. autoclass:: diff_diff.StaggeredTripleDifference` and
  `StaggeredTripleDiffResults` sections to docs/api/staggered.rst.
- Update docs-check.md table row from index.rst (where it was a false-pass)
  to staggered.rst (where the autoclass actually lives now).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 25, 2026
P2 - StaggeredTripleDifference catalog link points at api/index.html which
does not list it
- Update README.md L116 catalog link from `api/index.html` to
  `api/staggered.html#staggeredtripledifference` (where the autoclass added
  in the prior commit actually renders).
- Sync the same fix in diff_diff/guides/llms.txt L67 so the bundled
  AI-agent guide and the README agree on the canonical URL.

P3 - docs/api/staggered.rst intro counted "two main estimators" but page
now also documents StaggeredTripleDifference (added in prior commit)
- Update intro to list three estimators (CS, SA, StaggeredTripleDifference)
  and clarify the robustness-check note applies to CS+SA only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 25, 2026
(Branch rebased onto origin/main first - 32 landed commits replayed cleanly,
zero file overlap with the docs-refresh changeset.)

P2 - llms-practitioner.txt Step 6 inconsistent on dCDH HonestDiD support
- The high-level workflow surfaces (README, bundled llms.txt) and the
  practitioner guide's estimator tree (L177-180) all advertise
  compute_honest_did(results) on MultiPeriodDiD / CallawaySantAnna /
  ChaisemartinDHaultfoeuille, but Step 6 still said "MultiPeriodDiD and
  CallawaySantAnna only".
- Update Step 6 to add dCDH with the registry caveats: requires L_max >= 1,
  bounds use placebo estimates as pre-period coefficients (not standard
  event-study pre-treatment coefficients), uses diagonal variance (no full
  VCV available), emits UserWarning. Cross-references the REGISTRY.md
  "Note (HonestDiD integration)" entry under ChaisemartinDHaultfoeuille
  for the full caveat list.

P3 - docs/doc-deps.yaml unsectioned llms.txt entries
- Add `section: "Estimators"` to the 5 remaining unsectioned
  `diff_diff/guides/llms.txt` mappings (estimators.py, staggered.py,
  chaisemartin_dhaultfoeuille.py, results.py, __init__.py). Now every
  llms.txt mapping in doc-deps.yaml carries a section label, matching the
  new "first-class documentation surface" contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 44f263240c64502db158f866e5032369dc28f1d5


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review: the prior Step 6 HonestDiD inconsistency is resolved. diff_diff/guides/llms-practitioner.txt:L292-L308 now matches the dCDH caveats documented in docs/methodology/REGISTRY.md:L637-L640.
  • The prior StaggeredTripleDifference doc-link/API coverage issues are resolved in README.md:L100-L118, diff_diff/guides/llms.txt:L58-L69, and docs/api/staggered.rst:L6-L145.
  • The prior docs/doc-deps.yaml section-metadata gaps for touched diff_diff/guides/llms.txt mappings appear resolved; I did not find remaining touched entries without a section.
  • One P2 documentation/source-material issue remains: docs/references.rst:L157-L159 misattributes the Baker practitioner-guide paper, even though this PR makes docs/references.rst the bibliography source of truth via .claude/commands/docs-check.md:L21-L29 and CONTRIBUTING.md:L35-L37.
  • No estimator logic, weighting, variance/SE, defaults, or assumption checks changed in this PR.

Methodology

  • Severity: P2. Impact: docs/references.rst:L157-L159 attributes arXiv 2503.13323 to Baker, Larcker, Wang, Wang, but the in-repo practitioner guide already cites Andrew Baker, Brantly Callaway, Scott Cunningham, Andrew Goodman-Bacon, Pedro H. C. Sant'Anna at diff_diff/guides/llms-practitioner.txt:L610-L614, which matches the paper metadata. Because this PR also redirects contributor workflows to treat docs/references.rst as the canonical bibliography, the refreshed docs flow would propagate an incorrect scholarly citation. Concrete fix: replace the author list in docs/references.rst:L157-L159 with the actual arXiv authors while keeping the title and arXiv ID unchanged. (ideas.repec.org)

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. I did not see a new deferrable item that needs TODO.md.

Security

  • No findings. diff-diff.png is a normal PNG asset, and I did not see secrets or executable content in the changed files.

Documentation/Tests

  • No additional findings.
  • I did not run a full Sphinx/docutils build because docutils and sphinx are not installed in this environment.
  • Manual verification only: README.md is 188 lines with 14 H2 headings, and docs/index.rst:L65-L74 adds references to the docs toctree.

igerber added a commit that referenced this pull request Apr 25, 2026
P2 - docs/references.rst misattributed Baker (2025) "A Practitioner's Guide"
- Wrong author list "Baker, A. C., Larcker, D. F., Wang, C. C. Y., & Wang, X."
  conflates the 2022 JFE staggered-DiD paper (Baker/Larcker/Wang) with the
  2025 arXiv:2503.13323 practitioner's guide. Per the canonical citation in
  diff_diff/guides/llms-practitioner.txt L610-614 and arXiv metadata, the
  actual authors of arXiv:2503.13323 are Baker, Callaway, Cunningham,
  Goodman-Bacon & Sant'Anna.
- Fix: replace the author list (title and arXiv ID unchanged).
- Audit: grep for `Baker.*Larcker|Larcker.*Wang` across docs/, README.md,
  llms.txt, diff_diff/guides/, CHANGELOG.md, CLAUDE.md, CONTRIBUTING.md
  shows no other misattributions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 61841a1cfe40ae1deb7b88725b41e258e740d127


Overall Assessment

✅ Looks good. All prior re-review findings appear addressed. The highest remaining unmitigated issue is a P2 documentation-contract gap, which is compatible with ✅ under the stated criteria.

Executive Summary

  • The prior bibliography issue is fixed: docs/references.rst:L157-L159 now attributes the Baker practitioner-guide paper correctly.
  • The prior dCDH HonestDiD wording issue remains resolved: diff_diff/guides/llms-practitioner.txt:L296-L308 matches the caveat in docs/methodology/REGISTRY.md:L637-L641.
  • The prior StaggeredTripleDifference discoverability/link issue is resolved in the touched surfaces: diff_diff/guides/llms.txt:L67-L67 and docs/api/staggered.rst:L128-L145.
  • Severity: P2 [Newly identified]. The new canonical docs surfaces still omit public HeterogeneousAdoptionDiD, so the refresh institutionalizes an incomplete estimator catalog (README.md:L100-L118, diff_diff/guides/llms.txt:L51-L69, .claude/commands/docs-check.md:L35-L56 vs diff_diff/__init__.py:L64-L67 and diff_diff/__init__.py:L467-L470).
  • No estimator logic, weighting, variance/SE, defaults, or assumption checks changed in this PR.

Methodology

  • No findings. The touched methodology-adjacent text is consistent with the Methodology Registry, and I did not find a new undocumented deviation from source material.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No new findings. Existing deferred HAD docs work is already tracked at TODO.md:L108-L108.

Security

  • No findings.

Documentation/Tests

  • Severity: P2 [Newly identified]. Impact: this PR redefines README.md, diff_diff/guides/llms.txt, and .claude/commands/docs-check.md as the authoritative public-doc surfaces, but all three omit HeterogeneousAdoptionDiD even though it is a shipped public estimator (diff_diff/__init__.py:L64-L67, diff_diff/__init__.py:L467-L470) with a methodology section in docs/methodology/REGISTRY.md:L2251-L2255; it is also absent from the API index autosummary at docs/api/index.rst:L11-L30 and docs/api/index.rst:L37-L65. That means users, AI agents, and future docs sweeps will continue to miss a real public estimator. Concrete fix: add HeterogeneousAdoptionDiD to the flat estimator catalogs in README.md and diff_diff/guides/llms.txt, add it to .claude/commands/docs-check.md’s required-estimators table, add the citation to docs/references.rst, add API-index coverage in docs/api/index.rst plus a real API page/autosummary target for the class/results, and add the corresponding docs/doc-deps.yaml mapping for diff_diff/had.py. TODO.md:L108-L108 only tracks future llms.txt/tutorial work; it does not justify baking the omission into the required-docs contract.
  • Manual verification only. I did not run a full Sphinx/docutils build because docutils is unavailable in this environment.

igerber added a commit that referenced this pull request Apr 25, 2026
…cross all canonical surfaces

P2 - HeterogeneousAdoptionDiD shipped (Phases 1a-3 + 3-followup + 4.5 A/B/C0
merged via PRs #366/#367) but missing from the new authoritative documentation
surfaces this PR establishes
- README.md ## Estimators catalog: add one-line entry after ContinuousDiD
  (de Chaisemartin, Ciccia, D'Haultfoeuille & Knau 2026; alias HAD).
- diff_diff/guides/llms.txt ## Estimators: add matching one-liner.
- .claude/commands/docs-check.md required-estimators table: add HAD row
  pointing at had.rst as the API target.
- docs/references.rst: add new "Heterogeneous Adoption (No-Untreated
  Designs)" sub-section with the de Chaisemartin et al. (2026)
  arXiv:2405.04465v6 citation.
- docs/api/index.rst: add HeterogeneousAdoptionDiD to estimators
  autosummary; add HeterogeneousAdoptionDiDResults +
  HeterogeneousAdoptionDiDEventStudyResults to results autosummary.
- docs/api/had.rst (NEW): autoclass page for the three classes with a
  brief intro and a "When to use HAD" note pointing at sibling estimators
  (ContinuousDiD for never-treated controls, dCDH for binary reversible).
- docs/doc-deps.yaml: add diff_diff/had.py + diff_diff/had_pretests.py
  source mappings (REGISTRY methodology, had.rst api_reference, README
  catalog, references.rst, llms.txt). The llms-full.txt mapping is
  intentionally omitted with a comment - that section is deferred to the
  Phase 5 follow-up tracked in TODO.md.

TODO.md: narrow the Phase 5 entry from "llms.txt updates" to "llms-full.txt
HeterogeneousAdoptionDiD section" since the catalog one-liner, API page,
and bibliography landed here. practitioner_next_steps() integration and
tutorial notebook remain deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 709e1af2b90577c36d964fe11922c597c1888f2f


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Prior re-review completeness gaps are mostly addressed: HAD now appears in the README catalog, bundled llms.txt, a dedicated API page, and docs/references.rst, and the StaggeredTripleDifference link now targets the correct anchor (README.md:L113-L117, diff_diff/guides/llms.txt:L64-L68, docs/api/had.rst:L1-L58, docs/references.rst:L157-L171).
  • No estimator code, weighting logic, variance code, defaults, or assumption checks changed in this PR.
  • Severity P1: the new HAD API/docs copy materially misstates the estimator’s applicability and inference surface relative to docs/methodology/REGISTRY.md and diff_diff/had.py.
  • Severity P3: docs/api/had.rst cites Calonico/Cattaneo/Titiunik (2014), but docs/references.rst does not include that citation.
  • Severity P3: docs/api/index.rst adds HAD autosummary entries but still omits had.rst from the API module toctree, so the new page is not navigable from the RTD API landing page.

Methodology

  • Severity: P1. Impact: HeterogeneousAdoptionDiD is now described in ways that disagree with the shipped contract. docs/api/had.rst says the method is for panels where every post-period dose is strictly positive and there is no untreated comparison group, advertises HC2/Bell-McCaffrey SEs, and says the Appendix B.2 event-study produces correlated SEs and sup-t bands. The registry/code say the opposite: Design 1' explicitly allows d.min() == 0, the event-study last-cohort path retains never-treated units, the mass-point path only exposes classical/hc1 while continuous paths use CCT robust SE, and event-study inference is pointwise per horizon with no joint cross-horizon covariance; sup-t is only available on weighted event-study fits. This is not a documented deviation in REGISTRY.md. References: docs/api/had.rst:L4-L30, docs/methodology/REGISTRY.md:L2504-L2508, docs/methodology/REGISTRY.md:L2514-L2518, diff_diff/had.py:L542-L562, diff_diff/had.py:L2568-L2576, diff_diff/had.py:L2817-L2837, README.md:L113-L113, diff_diff/guides/llms.txt:L64-L64. Concrete fix: rewrite the HAD descriptions in docs/api/had.rst and the matching one-line catalog entries in README.md / diff_diff/guides/llms.txt to match the registry exactly: nonnegative-dose boundary handling, last-cohort-plus-never-treated event-study support, pointwise per-horizon CIs by default, and sup-t only for weighted event-study with cband=True.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • Severity: P3. Impact: the new docs/api/had.rst page is not added to the API landing page’s module-documentation toctree, so RTD navigation still has no path from api/index.html to the new narrative HAD page even though direct links exist elsewhere. References: docs/api/index.rst:L226-L246, docs/api/had.rst:L1-L58. Concrete fix: add had under Module Documentation -> Estimators in docs/api/index.rst.

Tech Debt

  • No findings. The remaining HAD llms-full.txt / tutorial / practitioner_next_steps() follow-up is correctly tracked in TODO.md:L108-L108 and is non-blocking.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: docs/api/had.rst newly cites Calonico/Cattaneo/Titiunik (2014), but the new bibliography source of truth does not contain that entry, so the docs-refresh contract is only partially applied. References: docs/api/had.rst:L17-L18, docs/references.rst:L1-L191, CONTRIBUTING.md:L26-L37. Concrete fix: add the exact citation to docs/references.rst, or remove that scholarly reference from docs/api/had.rst.
  • Manual review only. I did not run a Sphinx/docutils build in this environment.

Path to Approval

  1. Rewrite the HAD copy in docs/api/had.rst and the matching HAD catalog lines in README.md and diff_diff/guides/llms.txt so they match the shipped methodology contract in docs/methodology/REGISTRY.md: allow the Design 1' zero-boundary path, stop promising HC2/Bell-McCaffrey or correlated event-study SEs, and restrict sup-t language to weighted event-study fits with cband=True.

igerber added a commit that referenced this pull request Apr 25, 2026
…ctree

P1 - HAD descriptions disagreed with REGISTRY.md and shipped code
- Reviewer caught factual errors I introduced when drafting HAD copy:
  - "Strictly positive heterogeneous dose" excludes Design 1' (the QUG
    case where d̲ = 0 is allowed; mass-point fraction at d̲ matters
    for the support-plausibility heuristic but `d.min() == 0` is the
    canonical Design 1' invariant per REGISTRY L2504-2508).
  - "HC2 / Bell-McCaffrey small-sample SEs" is wrong for HAD: the
    mass-point path only exposes `classical` / `hc1` (HC2 / HC2_BM
    raise NotImplementedError pending 2SLS-specific leverage); the
    continuous paths use the CCT-2014 weighted-robust SE from the
    in-house lprobust port.
  - "Per-horizon WAS effects with correlated standard errors and sup-t
    bands" is wrong: per-horizon CIs are pointwise by default and have
    NO cross-horizon covariance (deferred to TODO follow-up). Sup-t
    is available ONLY on the weighted event-study path via
    `cband=True`.
  - "Multi-period event-study extension" silent on staggered handling:
    the estimator auto-filters staggered-timing panels to the
    last-treatment cohort (which retains never-treated units as
    comparisons) per paper Appendix B.2 prescription.
- Rewrite docs/api/had.rst, README.md catalog one-liner, and
  diff_diff/guides/llms.txt catalog one-liner to match REGISTRY exactly:
  Design 1' (d̲=0 / QUG) and Design 1 (d̲>0) both supported; three SE
  regimes documented (unweighted CCT-2014; `weights=` shortcut with
  classical/hc1 mass-point; `survey=` with Binder TSL); pointwise CIs
  per horizon; sup-t restricted to weighted event-study + cband=True;
  last-treatment-cohort restriction noted explicitly.

P3 - docs/api/had.rst cited Calonico/Cattaneo/Titiunik (2014) but it
was not in docs/references.rst (CONTRIBUTING.md now treats references.rst
as the bibliography source of truth)
- Drop the CCT 2014 inline citation from had.rst; the bias-correction
  primitive is an implementation detail tracked in REGISTRY.md and not
  needed at the user-facing API page level. The CCF 2018/2019 nprobust
  port reference is preserved as "ported from the nprobust machinery"
  without a specific citation, also matching how other API pages handle
  underlying numerical-method machinery.

P3 - had.rst was added to docs/api/index.rst autosummary but missed
the Module Documentation > Estimators toctree at the same file
- Add `had` between `continuous_did` and `efficient_did` in the
  Estimators toctree so the new page is navigable from the RTD API
  landing page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: c720d5ae4d5e3589990c0bc93953862a25d1da79


Overall Assessment

✅ Looks good

Executive Summary

  • Prior P1 HAD methodology-copy issue appears resolved: the new API page, README catalog entry, and bundled llms.txt now align with the registry on Design 1' vs Design 1, last-cohort event-study scope, pointwise CIs, and weighted-only sup-t language (docs/api/had.rst:L13-L23, docs/api/had.rst:L36-L52, README.md:L113-L113, diff_diff/guides/llms.txt:L64-L64, docs/methodology/REGISTRY.md:L2261-L2263, docs/methodology/REGISTRY.md:L2513-L2516).
  • Prior API-navigation gap is resolved: had is now in the API autosummaries/toctree, and StaggeredTripleDifference now has a dedicated staggered.rst section with corrected anchors from README and llms.txt (docs/api/index.rst:L18-L31, docs/api/index.rst:L232-L244, docs/api/staggered.rst:L128-L147, README.md:L117-L117, diff_diff/guides/llms.txt:L68-L68).
  • The new dCDH HonestDiD practitioner note matches the shipped extension and caveats (diff_diff/guides/llms-practitioner.txt:L296-L308, docs/methodology/REGISTRY.md:L637-L637, diff_diff/honest_did.py:L840-L859).
  • No estimator implementation, weighting logic, variance/SE code, defaults, or assumption checks changed in this PR; this is a docs/workflow PR.
  • Remaining findings are P3 documentation issues only: one lingering HAD wording mismatch on the new bibliography page, and missing bibliography entries for newly named HAD support sources.

Methodology

  • Severity: P3. Impact: the new HAD bibliography blurb still says the estimator targets settings with “strictly positive” doses, which is narrower than the documented Design 1' d̲ = 0 case and could confuse readers who land on references.html first. Concrete fix: rewrite the HAD blurb to match the API/registry wording, e.g. nonnegative dose support with Design 1' (d̲ = 0) and Design 1 (d̲ > 0). References: docs/references.rst:L171-L173, docs/api/had.rst:L15-L17, docs/methodology/REGISTRY.md:L2261-L2263.

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings. The remaining HAD llms-full.txt / tutorial follow-up is already tracked and non-blocking (TODO.md:L108-L108).

Security
No findings.

Documentation/Tests

  • Severity: P3. Impact: docs/api/had.rst newly names CCT-2014 and Binder (1983) as part of the HAD inference contract, but docs/references.rst does not include those sources, so the new bibliography page is not yet the complete scholarly source of truth this PR sets out to create. Concrete fix: add Calonico-Cattaneo-Titiunik (2014) and Binder (1983) to docs/references.rst (optionally also the Calonico-Cattaneo-Farrell 2018/2019 nprobust entries that the registry relies on), or remove those named references from docs/api/had.rst. References: docs/api/had.rst:L39-L46, diff_diff/had.py:L55-L63, docs/methodology/REGISTRY.md:L2334-L2342, docs/methodology/REGISTRY.md:L2501-L2503, docs/methodology/REGISTRY.md:L2915-L2918, docs/references.rst:L168-L173.
  • Manual review only; I did not run a Sphinx/doc build in this environment.

igerber added a commit that referenced this pull request Apr 25, 2026
…nference

P3 - HAD blurb in docs/references.rst still said "strictly positive"
- I corrected README, llms.txt, and api/had.rst in R7 but missed the
  bibliography blurb. Rewrite to match: "panels where no unit remains
  untreated...nonnegative dose...both Design 1' (d̲=0 / QUG) and
  Design 1 (d̲>0) supported."

P3 - api/had.rst names CCT-2014 and Binder (1983) but those weren't
in docs/references.rst (the new bibliography source of truth)
- Add three new sub-sections to references.rst:
  - "Nonparametric Bias-Corrected Inference" with CCT 2014 (Econometrica
    bias-combined design matrix) + CCF 2018 (JASA coverage accuracy) +
    CCF 2019 (JSS nprobust). All three back the in-house lprobust port
    used by HAD continuous-dose paths.
  - "Survey-Design Inference (Taylor-Series Linearization)" with Binder
    (1983) International Statistical Review. Backs the survey= paths
    across diff-diff (compute_survey_if_variance and per-estimator IF
    compositions including dCDH and HAD).
- These are also CONTRIBUTING.md "bibliography source of truth" entries
  for general inference machinery, not just HAD: the survey paths
  reference Binder TSL across all survey-aware estimators, and CCF
  nprobust is also used by ContinuousDiD's bias-correction path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 68341337b35c5ade6ec54578948a3cbd7e660647


Overall Assessment

✅ Looks good

Executive Summary

  • No P0 or P1 findings. This remains a docs/workflow PR; no estimator implementation, weighting logic, variance/SE code, defaults, or assumption checks changed in the diff.
  • Prior re-review documentation gaps appear resolved: docs/references.rst now includes the HAD support sources that were previously missing, including CCT-2014 and Binder (1983), at docs/references.rst:L54-L69.
  • The dCDH HonestDiD practitioner note now matches the shipped contract and registry note: diff_diff/guides/llms-practitioner.txt:L296-L308, docs/methodology/REGISTRY.md:L637-L637.
  • HAD is now properly exposed in the API index and result-class autosummaries: docs/api/index.rst:L27-L31, docs/api/index.rst:L60-L61, docs/api/index.rst:L236-L243.
  • One residual P3 documentation issue remains: the new HAD summary copy still compresses the d̲ > 0 paths into generic “targets WAS” language, while the shipped estimator contract distinguishes WAS from WAS_d_lower.

Methodology

  • Severity: P3. Impact: the new top-level HAD summaries imply that both Design 1' and Design 1 target the same WAS estimand, but the actual class contract and registry distinguish WAS on continuous_at_zero from WAS_d_lower on continuous_near_d_lower / mass_point. That is a documentation ambiguity, not an implementation bug, but it can misstate the estimand to readers landing on the overview pages. Concrete fix: rephrase the HAD summary copy in docs/api/had.rst:L13-L17, README.md:L113-L113, and diff_diff/guides/llms.txt:L64-L64 to say “WAS for Design 1' (d̲ = 0); WAS_d_lower / sign identification on d̲ > 0 paths,” or otherwise mirror the shipped contract in diff_diff/had.py:L220-L252, diff_diff/had.py:L579-L581, and docs/methodology/REGISTRY.md:L2509-L2513.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity: P3-informational. Impact: HAD still lacks llms-full coverage and practitioner_next_steps() integration, but this PR explicitly tracks that deferment, so it is not a blocker. Concrete fix: none for approval; follow the tracked item in TODO.md:L108.

Security

  • No findings.

Documentation/Tests

  • No findings beyond the P3 methodology wording note above. The previous bibliography gap is fixed by docs/references.rst:L54-L69, and the new had API page is now reachable from the API index via docs/api/index.rst:L236-L243.
  • Manual review only; I did not run a Sphinx build, README render, or link checker in this environment.

igerber added a commit that referenced this pull request Apr 25, 2026
…ction

P3 - HAD summary copy compressed both Design 1' and Design 1 into "targets
WAS", but the shipped class contract distinguishes them:
- Design 1' (`d̲ = 0`, `continuous_at_zero` path): identifies the original
  Weighted Average Slope (WAS, paper Equation 2). `target_parameter == "WAS"`
  on the result class.
- Design 1 (`d̲ > 0`, `continuous_near_d_lower` or `mass_point` paths):
  identifies `WAS_{d̲}` under Assumption 6 (or sign identification only
  under Assumption 5; both are non-testable via pre-trends).
  `target_parameter == "WAS_d_lower"` on the result class.

Rewrite the HAD summaries in three places (docs/api/had.rst, README.md
catalog one-liner, diff_diff/guides/llms.txt one-liner) to surface the
distinction explicitly. The api/had.rst summary also flags that callers
can key on the resolved `target_parameter` attribute, and that Design 1
identification requires the additional non-testable assumption beyond
parallel trends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: dd29001f62c83fa2cbbdfe15f36ea83b7bc3091f


Overall Assessment

✅ Looks good

Executive Summary

  • No P0/P1 findings. This remains a docs/workflow PR; no estimator implementation, weighting logic, variance/SE code, defaults, or assumption checks changed in the diff.
  • Previous re-review gaps look resolved: HAD is now exposed in the API index and result-class autosummaries (docs/api/index.rst:L27-L28, docs/api/index.rst:L60-L61, docs/api/index.rst:L235-L243).
  • The HAD top-level summary copy now distinguishes Design 1' WAS from Design 1 WAS_{d̲} on the public surfaces changed here (README.md:L113-L113, diff_diff/guides/llms.txt:L64-L64, docs/api/had.rst:L13-L20).
  • The bibliography split preserved the key HAD support citations, including CCT-2014 and Binder (1983) (docs/references.rst:L54-L69, docs/references.rst:L191-L193).
  • dCDH HonestDiD practitioner guidance now matches the shipped contract and registry note (diff_diff/guides/llms-practitioner.txt:L296-L308, docs/methodology/REGISTRY.md:L637-L637).
  • Residual issues are P3 documentation/maintainability only.

Methodology

  • Severity: P3. Impact: the new landing-page / concise-guide HAD catalog entries still describe the estimator as applicable to designs where no unit remains untreated, but they omit the shipped panel-only restriction. That can misroute repeated-cross-section users even though the full contract says the implementation requires a balanced panel and rejects RCS inputs. Concrete fix: append “panel-only” or “repeated cross-sections not yet supported” to README.md:L113-L113 and diff_diff/guides/llms.txt:L64-L64, mirroring docs/api/had.rst:L34-L39, diff_diff/had.py:L2800-L2810, and docs/methodology/REGISTRY.md:L2511-L2512.
  • Severity: P3. Impact: the new HAD API page summarizes weighted/survey inference too broadly for the mass-point path. It says survey= composes Binder TSL on both paths and only calls out hc2/hc2_bm as unsupported, but the shipped contract also rejects vcov_type="classical" on mass-point survey= paths and on weights= event-study with cband=True. Concrete fix: add one sentence to docs/api/had.rst:L49-L58 mirroring the deviation already documented at docs/methodology/REGISTRY.md:L2377-L2379 and enforced at diff_diff/had.py:L3364-L3375 and diff_diff/had.py:L4095-L4109.

Code Quality

No findings.

Performance

No findings.

Maintainability

  • Severity: P3. Impact: project trackers now disagree on HAD Phase 5 documentation status. TODO.md:L108-L108 says the llms.txt/README/API/references work landed in this PR, while the Methodology Registry checklist still says llms.txt remains undone (docs/methodology/REGISTRY.md:L2531-L2531). That invites review churn and conflicting guidance for future docs work. Concrete fix: update the REGISTRY checklist item to reflect the split now recorded in TODO.md.

Tech Debt

  • Severity: P3-informational. Impact: HAD still lacks llms-full.txt coverage, tutorial integration, and practitioner_next_steps() wiring, but this PR explicitly tracks that deferment in TODO.md:L108-L108, so it is non-blocking. Concrete fix: none for approval; follow the tracked item.

Security

No findings.

Documentation/Tests

No blocking findings. Manual review only; I did not run a Sphinx build, README render, or link checker in this environment.

igerber and others added 11 commits April 25, 2026 13:30
…tor conventions

The README had grown to 3,119 lines (~48K tokens) because workflow conventions
told contributors to add a per-estimator section, full usage examples, and a
scholarly references list to README on every release. Trimming the file alone
would regrow within a few releases. This PR trims AND redirects the conventions.

## README + new surfaces

- README.md: 3,119 -> 188 lines (-94%). Hero image, 6 badges (preserved), lean
  catalog, 14 H2 sections (down from 17). Tagline tightened per Nakora 2026
  "what / who / when / how-different" rubric. PyPI snippet now scannable.
- docs/references.rst: new RST page lifting the 137-line bibliography out of
  README (mechanical markdown -> RST conversion). Added to docs/index.rst
  toctree under "Getting Started".
- llms.txt: new repo-root pointer file for AI-crawler discovery; mirrors the
  RTD-hosted llms.txt / llms-full.txt / llms-practitioner.txt and lists the
  in-process get_llm_guide() modes.
- diff-diff.png: existing 1731x909 hero asset committed to git so the absolute
  raw.githubusercontent.com URL resolves on PyPI and on GitHub main post-merge.

## Convention redirects (prevent regression)

- CONTRIBUTING.md: replaced "For New Estimators: add to README" with a "README
  is a landing page, not the docs" section. New estimator workflow now routes
  to llms.txt (AI contract), docs/api/*.rst (technical), docs/references.rst
  (bibliography), docs/tutorials/ (examples), CHANGELOG.md (release notes).
  README only gets a one-line catalog entry.
- CLAUDE.md: new "README discipline" section documenting the source-of-truth
  surfaces and the landing-page constraint.
- .claude/commands/docs-check.md: replaced per-estimator "README Section"
  requirement with "llms.txt entry + flat-catalog one-liner". Scholarly
  references check redirected from README to docs/references.rst. All 18
  estimators retabled.
- .claude/commands/dev-checklists.md, pre-merge-check.md, review-pr.md,
  review-plan.md: Documentation Sync checklists now name llms.txt /
  docs/api/ / docs/references.rst first; README only for landing-page changes.
- docs/doc-deps.yaml: 18 stale README.md section pointers (per-estimator
  sections that no longer exist) redirected to "Estimators (one-line catalog
  entry)" / "For Data Scientists" / "Survey Support". /docs-impact will now
  flag the right action when adding a new estimator.

## SEO + discoverability

External research (pyOpenSci, Nakora 2026, OtterlyAI GEO study, scikit-learn
benchmark) drove three small additions: hero image with keyword-rich alt
text, sharper tagline, repo-root llms.txt. Repo metadata changes done out of
band (3 new GitHub topics: synthetic-control / survey-design / geo-experiments
/ marketing-analytics; "python" topic dropped as redundant signal; custom
social preview image uploaded).

## Verification

- README is 188 lines (target 165-220).
- 14 H2 sections (target 14).
- All 30 RTD links in README resolve (excluding new references.html which
  resolves after RTD rebuild).
- All 3 RTD links in /llms.txt resolve.
- get_llm_guide() unchanged (13,166 chars, smoke-tested).
- PyPI readme-renderer produces valid HTML with hero image and all 14 H2s.
- Hero image embedded in first 20 lines with all 4 marquee estimator names
  + "Difference-in-Differences" in alt text; required blank line after </p>.
- docs/references.rst parses cleanly via docutils (no warnings).
- Zero em dashes across README / references.rst / llms.txt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1 - docs/references.rst incomplete (5 missing citations)
- Add Baker, Larcker, Wang & Wang (2025) - "Difference-in-Differences Designs:
  A Practitioner's Guide" arXiv:2503.13323; backs the 8-step Practitioner
  Workflow now surfaced in README and llms-practitioner.txt.
- Add Callaway, Goodman-Bacon & Sant'Anna (2024) - "Difference-in-Differences
  with a Continuous Treatment" NBER WP 32117; backs ContinuousDiD (under new
  "Continuous Treatment DiD" section).
- Add Chen, Sant'Anna & Xie (2025) - "Efficient Difference-in-Differences and
  Event Study Estimators"; backs EfficientDiD.
- Add de Chaisemartin & D'Haultfoeuille (2022, revised 2024) NBER WP 29873;
  dynamic companion to dCDH 2020, backs the cohort-recentered plug-in
  variance formula.
- Add Wooldridge (2025) Empirical Economics + Wooldridge (2023) Econometrics
  Journal under new "Wooldridge ETWFE" sub-section; backs WooldridgeDiD's
  saturated OLS and logit/Poisson QMLE paths.

P1 - .claude/commands/docs-check.md
- Replace broken `awk '/^## Estimators/,/^## /'` (self-terminates on opening
  H2) with a flag-based extractor that skips the heading and stops at the
  next H2.
- Special-case diagnostic-class items (HonestDiD): the workflow text now
  routes them to `## Diagnostics & Sensitivity` in both llms.txt and README,
  matching the existing table.

P2 - docs/doc-deps.yaml
- Retarget HonestDiD's README entry from "Estimators (one-line catalog
  entry)" to "Diagnostics & Sensitivity (one-line entry)" so /docs-impact
  flags the correct README section.
- Add `- path: docs/references.rst` (type: user_guide) to all 18 method-
  backed source modules so /docs-impact and /pre-merge-check warnings
  surface bibliography updates when source code changes.

P2 - README.md absolute links
- L83 docs/methodology/REPORTING.md, L182 CITATION.cff, L184 CONTRIBUTORS.md
  switched to absolute https://github.com/igerber/diff-diff/blob/main/...
  URLs so they resolve on PyPI (which does not auto-resolve relative paths).

P3 - README.md tutorial count
- Drop the "18" hard-coded count in the Tutorials bullet to prevent future
  drift; repo currently has 19 notebooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 - docs/doc-deps.yaml partially migrated to new doc-surface contract
- Add `diff_diff/guides/llms.txt` entries to 12 estimator blocks that
  already had llms-full.txt (CallawaySantAnna, SunAbraham, ImputationDiD,
  TwoStageDiD, EfficientDiD, ContinuousDiD, StackedDiD, SyntheticDiD,
  TripleDifference, TROP, HonestDiD, WooldridgeDiD, ChaisemartinDHaultfoeuille,
  Survey). Section labels match the actual llms.txt heading the entry
  belongs under: "Estimators", "Diagnostics and Sensitivity Analysis"
  (HonestDiD), "Survey Support" (survey).
- Backfill bacon.py with llms-full.txt + llms.txt entries (and the missing
  02_staggered_did tutorial reference for Bacon decomposition).
- Backfill staggered_triple_diff.py with full surface mapping
  (16_survey_did tutorial, README, references.rst, llms-full.txt, llms.txt).
- Net: all 17 method-backed estimator sources now have the required
  README + llms.txt + llms-full.txt + references.rst entries. BusinessReport
  / DiagnosticReport intentionally lack llms.txt entries (not in llms.txt
  per their experimental-preview status).

P2 - StaggeredTripleDifference API doc target
- Add `.. autoclass:: diff_diff.StaggeredTripleDifference` and
  `StaggeredTripleDiffResults` sections to docs/api/staggered.rst.
- Update docs-check.md table row from index.rst (where it was a false-pass)
  to staggered.rst (where the autoclass actually lives now).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 - StaggeredTripleDifference catalog link points at api/index.html which
does not list it
- Update README.md L116 catalog link from `api/index.html` to
  `api/staggered.html#staggeredtripledifference` (where the autoclass added
  in the prior commit actually renders).
- Sync the same fix in diff_diff/guides/llms.txt L67 so the bundled
  AI-agent guide and the README agree on the canonical URL.

P3 - docs/api/staggered.rst intro counted "two main estimators" but page
now also documents StaggeredTripleDifference (added in prior commit)
- Update intro to list three estimators (CS, SA, StaggeredTripleDifference)
  and clarify the robustness-check note applies to CS+SA only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(Branch rebased onto origin/main first - 32 landed commits replayed cleanly,
zero file overlap with the docs-refresh changeset.)

P2 - llms-practitioner.txt Step 6 inconsistent on dCDH HonestDiD support
- The high-level workflow surfaces (README, bundled llms.txt) and the
  practitioner guide's estimator tree (L177-180) all advertise
  compute_honest_did(results) on MultiPeriodDiD / CallawaySantAnna /
  ChaisemartinDHaultfoeuille, but Step 6 still said "MultiPeriodDiD and
  CallawaySantAnna only".
- Update Step 6 to add dCDH with the registry caveats: requires L_max >= 1,
  bounds use placebo estimates as pre-period coefficients (not standard
  event-study pre-treatment coefficients), uses diagonal variance (no full
  VCV available), emits UserWarning. Cross-references the REGISTRY.md
  "Note (HonestDiD integration)" entry under ChaisemartinDHaultfoeuille
  for the full caveat list.

P3 - docs/doc-deps.yaml unsectioned llms.txt entries
- Add `section: "Estimators"` to the 5 remaining unsectioned
  `diff_diff/guides/llms.txt` mappings (estimators.py, staggered.py,
  chaisemartin_dhaultfoeuille.py, results.py, __init__.py). Now every
  llms.txt mapping in doc-deps.yaml carries a section label, matching the
  new "first-class documentation surface" contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 - docs/references.rst misattributed Baker (2025) "A Practitioner's Guide"
- Wrong author list "Baker, A. C., Larcker, D. F., Wang, C. C. Y., & Wang, X."
  conflates the 2022 JFE staggered-DiD paper (Baker/Larcker/Wang) with the
  2025 arXiv:2503.13323 practitioner's guide. Per the canonical citation in
  diff_diff/guides/llms-practitioner.txt L610-614 and arXiv metadata, the
  actual authors of arXiv:2503.13323 are Baker, Callaway, Cunningham,
  Goodman-Bacon & Sant'Anna.
- Fix: replace the author list (title and arXiv ID unchanged).
- Audit: grep for `Baker.*Larcker|Larcker.*Wang` across docs/, README.md,
  llms.txt, diff_diff/guides/, CHANGELOG.md, CLAUDE.md, CONTRIBUTING.md
  shows no other misattributions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cross all canonical surfaces

P2 - HeterogeneousAdoptionDiD shipped (Phases 1a-3 + 3-followup + 4.5 A/B/C0
merged via PRs #366/#367) but missing from the new authoritative documentation
surfaces this PR establishes
- README.md ## Estimators catalog: add one-line entry after ContinuousDiD
  (de Chaisemartin, Ciccia, D'Haultfoeuille & Knau 2026; alias HAD).
- diff_diff/guides/llms.txt ## Estimators: add matching one-liner.
- .claude/commands/docs-check.md required-estimators table: add HAD row
  pointing at had.rst as the API target.
- docs/references.rst: add new "Heterogeneous Adoption (No-Untreated
  Designs)" sub-section with the de Chaisemartin et al. (2026)
  arXiv:2405.04465v6 citation.
- docs/api/index.rst: add HeterogeneousAdoptionDiD to estimators
  autosummary; add HeterogeneousAdoptionDiDResults +
  HeterogeneousAdoptionDiDEventStudyResults to results autosummary.
- docs/api/had.rst (NEW): autoclass page for the three classes with a
  brief intro and a "When to use HAD" note pointing at sibling estimators
  (ContinuousDiD for never-treated controls, dCDH for binary reversible).
- docs/doc-deps.yaml: add diff_diff/had.py + diff_diff/had_pretests.py
  source mappings (REGISTRY methodology, had.rst api_reference, README
  catalog, references.rst, llms.txt). The llms-full.txt mapping is
  intentionally omitted with a comment - that section is deferred to the
  Phase 5 follow-up tracked in TODO.md.

TODO.md: narrow the Phase 5 entry from "llms.txt updates" to "llms-full.txt
HeterogeneousAdoptionDiD section" since the catalog one-liner, API page,
and bibliography landed here. practitioner_next_steps() integration and
tutorial notebook remain deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctree

P1 - HAD descriptions disagreed with REGISTRY.md and shipped code
- Reviewer caught factual errors I introduced when drafting HAD copy:
  - "Strictly positive heterogeneous dose" excludes Design 1' (the QUG
    case where d̲ = 0 is allowed; mass-point fraction at d̲ matters
    for the support-plausibility heuristic but `d.min() == 0` is the
    canonical Design 1' invariant per REGISTRY L2504-2508).
  - "HC2 / Bell-McCaffrey small-sample SEs" is wrong for HAD: the
    mass-point path only exposes `classical` / `hc1` (HC2 / HC2_BM
    raise NotImplementedError pending 2SLS-specific leverage); the
    continuous paths use the CCT-2014 weighted-robust SE from the
    in-house lprobust port.
  - "Per-horizon WAS effects with correlated standard errors and sup-t
    bands" is wrong: per-horizon CIs are pointwise by default and have
    NO cross-horizon covariance (deferred to TODO follow-up). Sup-t
    is available ONLY on the weighted event-study path via
    `cband=True`.
  - "Multi-period event-study extension" silent on staggered handling:
    the estimator auto-filters staggered-timing panels to the
    last-treatment cohort (which retains never-treated units as
    comparisons) per paper Appendix B.2 prescription.
- Rewrite docs/api/had.rst, README.md catalog one-liner, and
  diff_diff/guides/llms.txt catalog one-liner to match REGISTRY exactly:
  Design 1' (d̲=0 / QUG) and Design 1 (d̲>0) both supported; three SE
  regimes documented (unweighted CCT-2014; `weights=` shortcut with
  classical/hc1 mass-point; `survey=` with Binder TSL); pointwise CIs
  per horizon; sup-t restricted to weighted event-study + cband=True;
  last-treatment-cohort restriction noted explicitly.

P3 - docs/api/had.rst cited Calonico/Cattaneo/Titiunik (2014) but it
was not in docs/references.rst (CONTRIBUTING.md now treats references.rst
as the bibliography source of truth)
- Drop the CCT 2014 inline citation from had.rst; the bias-correction
  primitive is an implementation detail tracked in REGISTRY.md and not
  needed at the user-facing API page level. The CCF 2018/2019 nprobust
  port reference is preserved as "ported from the nprobust machinery"
  without a specific citation, also matching how other API pages handle
  underlying numerical-method machinery.

P3 - had.rst was added to docs/api/index.rst autosummary but missed
the Module Documentation > Estimators toctree at the same file
- Add `had` between `continuous_did` and `efficient_did` in the
  Estimators toctree so the new page is navigable from the RTD API
  landing page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nference

P3 - HAD blurb in docs/references.rst still said "strictly positive"
- I corrected README, llms.txt, and api/had.rst in R7 but missed the
  bibliography blurb. Rewrite to match: "panels where no unit remains
  untreated...nonnegative dose...both Design 1' (d̲=0 / QUG) and
  Design 1 (d̲>0) supported."

P3 - api/had.rst names CCT-2014 and Binder (1983) but those weren't
in docs/references.rst (the new bibliography source of truth)
- Add three new sub-sections to references.rst:
  - "Nonparametric Bias-Corrected Inference" with CCT 2014 (Econometrica
    bias-combined design matrix) + CCF 2018 (JASA coverage accuracy) +
    CCF 2019 (JSS nprobust). All three back the in-house lprobust port
    used by HAD continuous-dose paths.
  - "Survey-Design Inference (Taylor-Series Linearization)" with Binder
    (1983) International Statistical Review. Backs the survey= paths
    across diff-diff (compute_survey_if_variance and per-estimator IF
    compositions including dCDH and HAD).
- These are also CONTRIBUTING.md "bibliography source of truth" entries
  for general inference machinery, not just HAD: the survey paths
  reference Binder TSL across all survey-aware estimators, and CCF
  nprobust is also used by ContinuousDiD's bias-correction path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ction

P3 - HAD summary copy compressed both Design 1' and Design 1 into "targets
WAS", but the shipped class contract distinguishes them:
- Design 1' (`d̲ = 0`, `continuous_at_zero` path): identifies the original
  Weighted Average Slope (WAS, paper Equation 2). `target_parameter == "WAS"`
  on the result class.
- Design 1 (`d̲ > 0`, `continuous_near_d_lower` or `mass_point` paths):
  identifies `WAS_{d̲}` under Assumption 6 (or sign identification only
  under Assumption 5; both are non-testable via pre-trends).
  `target_parameter == "WAS_d_lower"` on the result class.

Rewrite the HAD summaries in three places (docs/api/had.rst, README.md
catalog one-liner, diff_diff/guides/llms.txt one-liner) to surface the
distinction explicitly. The api/had.rst summary also flags that callers
can key on the resolved `target_parameter` attribute, and that Design 1
identification requires the additional non-testable assumption beyond
parallel trends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(Branch rebased onto origin/main first - 5 landed commits replayed cleanly,
zero file overlap with the docs-refresh changeset.)

P3 - HAD catalog one-liners omit panel-only restriction
- Both README.md L113 and diff_diff/guides/llms.txt L64 describe the
  estimator without noting that the validator rejects repeated
  cross-section inputs (per REGISTRY L2511-2512 and had.py
  _validate_had_panel). Append "Panel-only in this release - repeated
  cross-sections rejected by the validator" so practitioners with RCS
  data are not misrouted.

P3 - api/had.rst inference contract under-specifies mass-point survey
vcov rejections
- Reviewer caught that the inference contract section says only HC2 /
  HC2_BM are unsupported, but the shipped code also rejects
  vcov_type="classical" on:
    - design="mass_point" + survey= (static and event-study)
    - design="mass_point" + weights= + aggregate="event_study" + cband=True
  The HC1-scaled per-unit 2SLS IF would silently target V_HC1 under a
  classical label.
- Add a paragraph to api/had.rst mirroring REGISTRY L2377-L2379 and
  the front-door NotImplementedError messages at had.py L3360-3380 and
  L4090-4115 (pointing users to vcov_type="hc1" or robust=True).

P3 - REGISTRY checklist disagrees with TODO.md on HAD Phase 5 status
- TODO.md L108 says the README catalog / llms.txt / API page /
  references.rst portion of HAD Phase 5 landed in PR #372, but
  REGISTRY L2531 still says "Phase 5: Tutorial notebook + llms.txt +
  llms-full.txt updates" as one undone item.
- Split the REGISTRY checklist row: mark the PR #372 portion as [x]
  with explicit scope (README catalog, bundled llms.txt, docs/api/had.rst,
  docs/references.rst), keep the remaining tutorial + llms-full.txt
  work as [ ].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: a21cd5d045772eb5deb601fd426f9c6dd401201f


Overall Assessment

✅ Looks good

No unmitigated P0/P1 findings. This remains a docs/workflow/asset PR; the methodology-facing changes are mostly aligned with the shipped HAD and dCDH contracts.

Executive Summary

  • Previous re-review gaps appear resolved: HAD now states the panel-only restriction on the public catalog surfaces, the new HAD API page documents the mass-point vcov_type="classical" limitation, and the Phase 5 tracker entries are aligned across TODO.md and REGISTRY.md (README.md:L113-L113, diff_diff/guides/llms.txt:L64-L64, docs/api/had.rst:L61-L71, TODO.md:L108-L108, docs/methodology/REGISTRY.md:L2531-L2532).
  • The dCDH HonestDiD update in the practitioner guide matches the shipped contract: dCDH requires L_max >= 1, uses placebo coefficients as pre-periods, uses diagonal variance, and emits a runtime warning (diff_diff/guides/llms-practitioner.txt:L296-L308, docs/methodology/REGISTRY.md:L637-L637, diff_diff/honest_did.py:L843-L968).
  • No estimator implementation, weighting logic, variance code, defaults, or assumption checks changed in this PR.
  • Severity: P3 [Newly identified]. A stale HAD note remains in the Methodology Registry claiming weights= / survey= raise NotImplementedError, which now contradicts both the new API page and the implementation (docs/methodology/REGISTRY.md:L2511-L2512 vs docs/api/had.rst:L43-L71, diff_diff/had.py:L3364-L3375, diff_diff/had.py:L4095-L4109).
  • Severity: P3. The new dCDH HonestDiD caveat in the bundled practitioner guide points offline get_llm_guide("practitioner") users to docs/methodology/REGISTRY.md, but that file is not bundled in the wheel (diff_diff/guides/llms-practitioner.txt:L303-L308, README.md:L61-L72, pyproject.toml:L96-L98).

Methodology

  • Severity: P3 [Newly identified]. Impact: docs/methodology/REGISTRY.md still has a stale Phase 2a/2b note saying HAD weights= and survey= raise NotImplementedError, which conflicts with the newly added HAD API reference and current weighted/survey implementation. That creates contradictory methodology guidance in the project’s own source of truth. Concrete fix: update or explicitly mark the historical note at docs/methodology/REGISTRY.md:L2511-L2512 as superseded by the current Phase 4.5 contract already documented at docs/methodology/REGISTRY.md:L2340-L2379 and surfaced in docs/api/had.rst:L43-L71.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity: P3-informational. Impact: Remaining HAD follow-up work (practitioner_next_steps(), tutorial, llms-full.txt) is explicitly tracked, so it is non-blocking under the project’s deferred-work policy (TODO.md:L108-L110, docs/methodology/REGISTRY.md:L2530-L2534). Concrete fix: none for approval; keep the tracker entries synchronized as the remaining Phase 5 work lands.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: the new dCDH HonestDiD caveat in diff_diff/guides/llms-practitioner.txt sends readers to docs/methodology/REGISTRY.md, but wheel packaging only includes diff_diff/guides/*.txt. That makes the reference dead for the exact offline get_llm_guide("practitioner") workflow the README advertises as no-network (diff_diff/guides/llms-practitioner.txt:L303-L308, README.md:L61-L72, pyproject.toml:L96-L98). Concrete fix: inline the caveats directly in llms-practitioner.txt, or replace the repo-path pointer with a public RTD URL and note that it requires network access.
  • Validation note: I did not run a Sphinx build or import smoke test in this environment; import diff_diff fails here because runtime dependencies are not installed (numpy missing).

…actitioner caveats

P3 - REGISTRY.md L2511 had a stale "Phase 2a/2b scope" note saying
HAD `weights=` and `survey=` raise NotImplementedError, contradicting
both the new HAD API page and the implementation (Phase 4.5 A/B/C0
lifted both gates with the per-design vcov contract documented at
L2340-L2379).
- Mark the note as "superseded by Phase 4.5" and explicitly cross-
  reference the current contract block (mass-point vcov_type="classical"
  deviation, cband=True sup-t restriction, Binder TSL via
  compute_survey_if_variance on both IFs).

P3 - dCDH HonestDiD caveat in diff_diff/guides/llms-practitioner.txt
pointed offline get_llm_guide("practitioner") users at
docs/methodology/REGISTRY.md, but the wheel only ships
diff_diff/guides/*.txt (per pyproject.toml). The pointer is dead for
the no-network workflow the README advertises.
- Replace the pointer with inline expansion of the most load-bearing
  caveats: DeltaRM Mbar=1.0 + l_vec=None default targets the equal-
  weight average over post horizons; R's HonestDiD defaults to the
  first post effect (pass l_vec for parity); trends_linear=True bounds
  the second-differenced estimand; trends_nonparam grid gaps trigger a
  largest-consecutive-block filter with warning.
- Practitioners running `get_llm_guide("practitioner")` now get the
  full applicable caveats without needing network access or a
  source-tree clone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 41107c85187490fc5af6e89966d4788170ffd04c


Overall Assessment

✅ Looks good

Executive Summary

  • Previous re-review issues appear resolved: the HAD registry note now marks the old survey= / weights= limitation as superseded, and the bundled practitioner guide now inlines the dCDH HonestDiD caveats in a way that matches the implementation (docs/methodology/REGISTRY.md:L2507-L2512, diff_diff/guides/llms-practitioner.txt:L296-L314, diff_diff/honest_did.py:L840-L980).
  • No estimator implementation, weighting logic, variance code, defaults, or assumption checks changed in this diff.
  • Severity P2. The new README survey section overstates a uniform survey API/weight-type contract. README.md:L129-L137 says all estimators accept survey_design with pweight/fweight/aweight, but HeterogeneousAdoptionDiD.fit() uses survey= / weights= (diff_diff/had.py:L2777-L2788) and rejects non-pweight survey designs on the continuous path (diff_diff/had.py:L2984-L2994); other survey-aware estimators such as Callaway-Sant'Anna and StackedDiD are also pweight-only (diff_diff/staggered.py:L1497-L1504, diff_diff/stacked_did.py:L249-L260).
  • Severity P3-informational. Remaining HAD Phase 5 guide gaps are explicitly tracked, so they are non-blocking (docs/doc-deps.yaml:L366-L384, TODO.md:L108-L110).

Methodology

  • No findings. The changed HAD docs now align with the registry and implementation on the panel-only restriction, Design 1 / Design 1' targets, and the documented mass-point vcov_type="classical" deviation (docs/methodology/REGISTRY.md:L2340-L2379, docs/api/had.rst:L43-L71, diff_diff/had.py:L3317-L3376, diff_diff/had.py:L4074-L4110).
  • No findings. The new dCDH HonestDiD practitioner text matches code: L_max >= 1, placebo horizons as pre-period coefficients, diagonal variance, and consecutive-block filtering with warnings (diff_diff/guides/llms-practitioner.txt:L296-L314, diff_diff/honest_did.py:L840-L980).

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity P3-informational. Impact: the PR correctly leaves the remaining HAD offline-guide/practitioner integration work tracked rather than silently presenting it as complete. Concrete fix: none for approval; keep the planned Phase 5 follow-up scoped to diff_diff/guides/llms-full.txt, tutorials, and practitioner_next_steps() as already recorded in docs/doc-deps.yaml:L366-L384 and TODO.md:L108-L110.

Security

  • No findings.

Documentation/Tests

  • Severity P2. Impact: the new README survey section advertises a public contract the codebase does not uniformly expose. A reader following README.md:L129-L137 can reasonably try survey_design=SurveyDesign(..., weight_type="aweight") on any estimator, but the newly added HAD API docs and implementation use survey= / weights= instead of survey_design= (docs/api/had.rst:L43-L71, diff_diff/had.py:L2777-L2788), and multiple estimators reject non-pweight designs (diff_diff/had.py:L2984-L2994, diff_diff/staggered.py:L1497-L1504, diff_diff/stacked_did.py:L249-L260). Concrete fix: rewrite the landing-page copy to say survey support varies by estimator, call out the HAD survey= / weights= exception explicitly, and defer supported weight types to the compatibility matrix.
  • Validation note: I did not run a Sphinx build or README renderer here; I only did targeted source cross-checks and a local parse of docs/doc-deps.yaml.

…uniform contract

P2 - README.md L131-137 advertised "all estimators accept survey_design"
with "weight types pweight/fweight/aweight" as a uniform contract, but:
- HeterogeneousAdoptionDiD.fit() uses survey= / weights= (not survey_design=)
- HAD continuous path rejects non-pweight survey designs
- CallawaySantAnna and StackedDiD are also pweight-only
A reader could reasonably try `SurveyDesign(..., weight_type="aweight")`
on any estimator and hit a NotImplementedError.

Rewrite the Survey Support section to:
- Say "Most" not "All" estimators accept survey_design, with explicit
  callout of HAD's distinct survey= / weights= API
- Defer per-estimator weight-type support to the existing compatibility
  matrix on RTD instead of claiming uniform pweight/fweight/aweight
- Note explicitly that some surfaces (CS, StackedDiD, HAD continuous)
  are pweight-only while others accept all three weight types
- Cite Binder (1983) for TSL alongside the methods list

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 06384266166c82d013f44b68396e26c456947996


Overall Assessment

✅ Looks good

Executive Summary

  • The prior README survey-contract issue appears fixed at the landing-page level: README.md:L129-L133 now says “most estimators,” explicitly calls out HAD’s survey= / weights= exception, and defers weight-type details to the compatibility matrix.
  • No estimator implementation, weighting logic, variance code, defaults, or assumption checks changed in this diff.
  • The new HAD API page and bibliography entry align with the registry/code on panel-only scope, WAS vs WAS_d_lower, pointwise event-study CIs, and the documented weighted mass-point inference restrictions.
  • The dCDH HonestDiD practitioner-guide update matches the implementation on L_max >= 1, placebo-based pre-period coefficients, diagonal variance, equal-weight default targeting, and consecutive-block filtering.
  • Remaining P2: the PR elevates diff_diff/guides/llms.txt to “source of truth” status, but that file still overstates a uniform survey_design API / weight-type contract and the linked survey matrix still omits HAD.
  • Remaining HAD Phase 5 docs gaps are explicitly tracked in TODO.md, so they are non-blocking.

Methodology

No findings. The new HAD docs match the existing registry/code contract on panel-only support, WAS vs WAS_d_lower, pointwise event-study CIs, and the weighted mass-point vcov_type="classical" restriction (docs/api/had.rst:L13-L71, docs/methodology/REGISTRY.md:L2340-L2379, diff_diff/had.py:L2777-L2879). The dCDH HonestDiD practitioner text also matches the implementation on L_max >= 1, placebo-based pre-period coefficients, diagonal variance, equal-weight default l_vec=None, and consecutive-block filtering (diff_diff/guides/llms-practitioner.txt:L296-L314, diff_diff/honest_did.py:L840-L968).

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

  • P3 Impact: Remaining HAD documentation work is explicitly deferred rather than silently omitted: the outstanding llms-full.txt / tutorial / practitioner_next_steps() work is tracked in TODO.md:L108-L110 and docs/methodology/REGISTRY.md:L2530-L2532. Concrete fix: none for approval; keep the planned Phase 5 follow-up scoped to those surfaces.

Security

No findings.

Documentation/Tests

  • P2 Impact: The PR now declares diff_diff/guides/llms.txt the AI-agent “source of truth” / canonical machine-readable contract (CONTRIBUTING.md:L21-L25, CLAUDE.md:L121-L130, .claude/commands/docs-check.md:L21-L29), but that guide still says “All estimators accept an optional survey_design parameter” and advertises generic pweight/fweight/aweight support (diff_diff/guides/llms.txt:L100-L108). That is false for HeterogeneousAdoptionDiD, whose public fit API is survey= / weights= rather than survey_design= (diff_diff/had.py:L2777-L2788), and the compatibility matrix the new README points readers to still omits HAD entirely (README.md:L129-L133, docs/choosing_estimator.rst:L694-L706, docs/choosing_estimator.rst:L715-L805). Concrete fix: update diff_diff/guides/llms.txt to mirror the README wording (“most estimators,” explicit HAD exception, per-estimator weight-type caveat) and either add HAD to the survey-support matrix or stop referring to that matrix as the complete per-estimator table until it covers HAD.

Validation note: I did not run a Sphinx build or README renderer; this review is based on targeted source cross-checks of the diff and surrounding implementation.

@igerber igerber added the ready-for-ci Triggers CI test workflows label Apr 25, 2026
@igerber igerber merged commit 903d3ab into main Apr 25, 2026
23 of 24 checks passed
@igerber igerber deleted the docs-refresh branch April 25, 2026 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant